7 research outputs found

    Studio e implementazione di un sistema ensemble per il parsing dell'italiano

    This thesis evaluates the performance of eight dependency parsers built on deep neural network architectures. Using two Italian corpora from Universal Dependencies, one general-domain and the other from social media (specifically Twitter), the experiments show that training on the social media corpus yields a significant gain in parsing accuracy over training on the general-domain corpus when both are evaluated on social media text. They also show that enlarging the training corpus, including with in-domain data, further improves parser performance. The trained parsing models were then used to test several ensemble techniques that combine the models and the predictions of the individual parsers, with the aim of improving on the individual models and surpassing the best single parser obtained earlier. The results indicate that combined models give a significant performance gain on the social media domain compared with the general domain.

    State-of-the-art Italian dependency parsers based on neural and ensemble systems

    This paper tests the most advanced, state-of-the-art syntactic dependency parsers based on deep neural networks (DNN) on Italian. We ran a large set of experiments using two Italian treebanks containing different text types, downloaded from the Universal Dependencies project, and propose a new solution based on ensemble systems. We implemented the proposed ensemble solutions by testing different techniques described in the literature, obtaining very good parsing results, well above the state of the art for Italian.

    Parsing Italian texts together is better than parsing them alone!

    In this paper we present work aimed at testing the most advanced, state-of-the-art syntactic parsers based on deep neural networks (DNN) on Italian. We ran a set of experiments using the treebanks available as Universal Dependencies benchmarks and propose a new solution based on ensemble parsing, which showed very good performance.

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall, “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

    Reducing the environmental impact of surgery on a global scale: systematic review and co-prioritization with healthcare workers in 132 countries

    Background: Healthcare cannot achieve net-zero carbon without addressing operating theatres. The aim of this study was to prioritize feasible interventions to reduce the environmental impact of operating theatres. Methods: This study adopted a four-phase Delphi consensus co-prioritization methodology. In phase 1, a systematic review of published interventions and global consultation of perioperative healthcare professionals were used to longlist interventions. In phase 2, iterative thematic analysis consolidated comparable interventions into a shortlist. In phase 3, the shortlist was co-prioritized based on patient and clinician views on acceptability, feasibility, and safety. In phase 4, ranked lists of interventions were presented by their relevance to high-income countries and low–middle-income countries. Results: In phase 1, 43 interventions were identified, which had low uptake in practice according to 3042 professionals globally. In phase 2, a shortlist of 15 intervention domains was generated. In phase 3, interventions were deemed acceptable for more than 90 per cent of patients except for reducing general anaesthesia (84 per cent) and re-sterilization of ‘single-use’ consumables (86 per cent). In phase 4, the top three shortlisted interventions for high-income countries were: introducing recycling; reducing use of anaesthetic gases; and appropriate clinical waste processing. The top three shortlisted interventions for low–middle-income countries were: introducing reusable surgical devices; reducing use of consumables; and reducing the use of general anaesthesia. Conclusion: This is a step toward environmentally sustainable operating environments with actionable interventions applicable to both high-income and low–middle-income countries.

    PoSTWITA-UD: an Italian Twitter Treebank in Universal Dependencies

    Due to the spread of social media-based applications and the challenges posed by the treatment of social media texts in NLP tools, tailored approaches and ad hoc resources are required to provide proper coverage of specific linguistic phenomena. Various attempts to produce this kind of specialized resources and tools are described in the literature. However, most of these attempts focus mainly on PoS-tagged corpora and only a few of them deal with syntactic annotation. This is particularly true for the Italian language, for which such a resource is currently missing. We thus propose the development of PoSTWITA-UD, a collection of tweets annotated according to a well-known dependency-based annotation format: the Universal Dependencies. The goal of this work is manifold, and it mainly consists in creating a resource that, especially for Italian, can be exploited for the training of NLP systems so as to enhance their performance on social media texts. In this paper we focus on the current state of the resource.